Supplementary Material for Temporal Dynamic Quantization for Diffusion Models

Neural Information Processing Systems

The following items are provided:
- A comparison between Dynamic Quantization and TDQ in Section 2.
- Ablation study on time step encoding in Section 3.
- Detailed TDQ Module architecture in Section 4.
- Comparison with multiple quantization intervals directly on PTQ in Section 5.
- Integration of TDQ with various QAT schemes in Section 6.
- Various experiments on the robustness of the TDQ Module in Section 7.
- Detailed experimental results on the output dynamics of the TDQ Module in Section 8.
- Detailed experimental results on the evolution of the activation distribution in Section 9.
- Various non-cherry-picked results of generated images in Section 10.
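The core idea behind TDQ is a small module that maps the diffusion time step to a quantization interval, so the interval can track how activation statistics change over the denoising trajectory. The sketch below is a minimal illustration of that idea, not the paper's actual architecture: the `TDQModule` class, its random linear layer, and the softplus output head are all assumptions made here for demonstration.

```python
import numpy as np

def time_embedding(t, dim=8):
    """Sinusoidal encoding of the diffusion time step (Transformer/DDPM-style)."""
    freqs = np.exp(-np.log(10_000) * np.arange(dim // 2) / (dim // 2))
    ang = t * freqs
    return np.concatenate([np.sin(ang), np.cos(ang)])

class TDQModule:
    """Toy stand-in for a TDQ module: maps the time step to a positive
    quantization interval via a (random, untrained) linear layer."""
    def __init__(self, dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0, 0.5, dim)
        self.b = 1.0

    def interval(self, t):
        z = self.w @ time_embedding(t, len(self.w)) + self.b
        return np.log1p(np.exp(z))  # softplus keeps the interval positive

tdq = TDQModule()
print(tdq.interval(0), tdq.interval(500))  # the interval varies with the time step
```

In a trained version, the linear layer's parameters would be learned end-to-end so the predicted interval minimizes quantization error at each time step; here they are random and only show the input/output shape of such a module.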




Post-Training Non-Uniform Quantization for Convolutional Neural Networks

Luqman, Ahmed, Qazi, Khuzemah, Khan, Imdadullah

arXiv.org Artificial Intelligence

Despite the success of CNN models on a variety of image classification and segmentation tasks, their extensive computational and storage demands pose considerable challenges for real-world deployment on resource-constrained devices. Quantization is one technique that aims to alleviate these large storage requirements and speed up the inference process by reducing the precision of model parameters to lower-bit representations. In this paper, we introduce a novel post-training quantization method for model weights. Our method finds optimal clipping thresholds and scaling factors, along with mathematical guarantees that it minimizes quantization noise. Empirical results on real-world datasets demonstrate that our quantization scheme significantly reduces model size and computational requirements while preserving model accuracy.
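A common baseline for the clipping-threshold idea mentioned above is to search for the clip value that minimizes quantization mean-squared error on the weights. The sketch below illustrates that baseline with a simple grid search; the function names (`quantize`, `search_clip`) and the uniform symmetric quantizer are assumptions for illustration, not the paper's method, which derives the thresholds analytically with guarantees.

```python
import numpy as np

def quantize(w, clip, bits=4):
    """Uniform symmetric quantization of w clipped to [-clip, clip]."""
    levels = 2 ** (bits - 1) - 1        # e.g. 7 positive levels for 4-bit
    scale = clip / levels
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale

def search_clip(w, bits=4, grid=100):
    """Grid-search the clipping threshold that minimizes quantization MSE."""
    best_clip, best_err = None, np.inf
    max_abs = np.abs(w).max()
    for c in np.linspace(max_abs / grid, max_abs, grid):
        err = np.mean((w - quantize(w, c, bits)) ** 2)
        if err < best_err:
            best_clip, best_err = c, err
    return best_clip, best_err

rng = np.random.default_rng(0)
w = rng.normal(0, 1, 10_000)            # synthetic Gaussian "weights"
clip, err = search_clip(w)
print(clip, err)
```

For bell-shaped weight distributions, the optimal clip is well below the maximum absolute weight: clipping a few outliers buys a finer scale for the bulk of the values.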


DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding

Lee, Jooyoung, Jeong, Se Yoon, Kim, Munchurl

arXiv.org Artificial Intelligence

Unlike fixed- or variable-rate image coding, progressive image coding (PIC) aims to compress various qualities of images into a single bitstream, increasing the versatility of bitstream utilization and providing high compression efficiency compared to simulcast compression. Research on neural network (NN)-based PIC is in its early stages, mainly focusing on applying varying quantization step sizes to the transformed latent representations in a hierarchical manner. These approaches are designed to compress only the progressively added information as the quality improves, considering that a wider quantization interval for lower-quality compression includes multiple narrower sub-intervals for higher-quality compression. However, the existing methods are based on handcrafted quantization hierarchies, resulting in sub-optimal compression efficiency. In this paper, we propose an NN-based progressive coding method that is the first to utilize learned quantization step sizes for each quantization layer. We also incorporate selective compression, with which only the essential representation components are compressed at each quantization layer. We demonstrate that our method achieves significantly higher coding efficiency than the existing approaches, with decreased decoding time and reduced model size.
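The nesting property described above (a coarse quantization interval containing narrower sub-intervals) is what lets each layer transmit only refinement information. A minimal sketch of that hierarchy, assuming handcrafted step sizes that halve at each layer (the learned step sizes of the paper are not reproduced here), and with a hypothetical function name:

```python
def hierarchical_quantize(x, steps):
    """Reconstruct x at each layer of a nested quantization hierarchy.
    Each step size divides the previous one, so every coarse interval
    contains the finer sub-intervals used by higher-quality layers."""
    return [round(x / s) * s for s in steps]

# step sizes halve each layer: 1.0 -> 0.5 -> 0.25
print(hierarchical_quantize(0.8, [1.0, 0.5, 0.25]))  # reconstruction refines layer by layer
```

The reconstruction error is non-increasing as the step size shrinks, which is exactly why a decoder can stop at any layer of the bitstream and still hold a valid (coarser) image.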


Zero-delay Consistent Signal Reconstruction from Streamed Multivariate Time Series

Ruiz-Moreno, Emilio, López-Ramos, Luis Miguel, Beferull-Lozano, Baltasar

arXiv.org Artificial Intelligence

Digitalizing real-world analog signals typically involves sampling in time and discretizing in amplitude. Subsequent signal reconstructions inevitably incur an error that depends on the amplitude resolution and the temporal density of the acquired samples. From an implementation viewpoint, consistent signal reconstruction methods have been shown to achieve a favorable error-rate decay as the sampling rate increases. However, these results were obtained in offline settings, so a research gap exists regarding methods for consistent signal reconstruction from data streams. This paper presents a method that consistently reconstructs streamed multivariate time series of quantization intervals under a zero-delay response requirement. Previous work has shown that the temporal dependencies within univariate time series can be exploited to reduce the roughness of zero-delay signal reconstructions. This work shows that the spatiotemporal dependencies within multivariate time series can also be exploited to achieve improved results. Specifically, the spatiotemporal dependencies of the multivariate time series are learned, with the assistance of a recurrent neural network, to reduce the roughness of the signal reconstruction on average while ensuring consistency. Our experiments show that our proposed method achieves a favorable error-rate decay with the sampling rate compared to a similar but non-consistent reconstruction.
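Two constraints in the abstract can be made concrete: zero delay means each output sample may depend only on past reconstructions, and consistency means each output must lie inside the quantization interval observed at that step. A minimal univariate sketch, assuming a trivial last-value predictor in place of the paper's recurrent network (the function names here are hypothetical):

```python
def consistent_reconstruct(intervals, predict):
    """Zero-delay reconstruction: at each step, form a prediction from past
    outputs only, then clamp it into the current quantization interval so
    the reconstruction stays consistent with the quantized observations."""
    out = []
    for lo, hi in intervals:
        y_hat = predict(out)              # causal: uses only past reconstructions
        out.append(min(max(y_hat, lo), hi))
    return out

# toy predictor: repeat the last reconstructed value (0.0 at the start)
predict = lambda past: past[-1] if past else 0.0
stream = [(0.0, 1.0), (0.5, 1.5), (0.4, 1.4)]
print(consistent_reconstruct(stream, predict))
```

Replacing the last-value predictor with a learned model (e.g. an RNN over past samples) is what reduces the roughness of the reconstruction; the clamping step is what preserves consistency regardless of the predictor's quality.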


Low Entropy Communication in Multi-Agent Reinforcement Learning

Yu, Lebin, Qiu, Yunbo, Wang, Qiexiang, Zhang, Xudong, Wang, Jian

arXiv.org Artificial Intelligence

Communication in multi-agent reinforcement learning has recently drawn attention for its significant role in cooperation. However, multi-agent systems may face limited communication resources and thus need efficient communication techniques in real-world scenarios. According to the Shannon-Hartley theorem, messages to be transmitted reliably over worse channels require lower entropy. Therefore, we aim to reduce message entropy in multi-agent communication. A fundamental challenge is that the gradients of entropy are either 0 or infinity, disabling gradient-based methods. To address this, we propose a pseudo gradient descent scheme that reduces entropy by carefully adjusting the message distributions. We conduct experiments on two base communication frameworks with six environment settings and find that our scheme can reduce message entropy by up to 90% with nearly no loss of cooperation performance.
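The difficulty named in the abstract is that the entropy of a discrete message distribution is piecewise constant in the message values, so its gradient is 0 almost everywhere (and undefined at bin boundaries). A gradient-free update that concentrates the distribution can still lower entropy. The sketch below is a hand-rolled heuristic in that spirit, not the paper's pseudo gradient descent scheme: `pseudo_step` and the merge-to-nearest-popular-symbol rule are assumptions for illustration.

```python
import numpy as np
from collections import Counter

def entropy(msgs):
    """Shannon entropy (bits) of a batch of discrete messages."""
    counts = np.array(list(Counter(msgs).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def pseudo_step(msgs):
    """One heuristic update: move each message to the most frequent symbol
    within distance 1, concentrating the distribution to reduce entropy."""
    counts = Counter(msgs)
    return [max((m - 1, m, m + 1), key=lambda c: counts.get(c, 0))
            for m in msgs]

msgs = [0, 0, 1, 1, 1, 2, 3, 3]
print(entropy(msgs), entropy(pseudo_step(msgs)))
```

Note the trade-off the paper has to manage: merging symbols lowers entropy but also discards information, so the adjustment must be small enough to preserve the messages' usefulness for cooperation.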